# REPLICATION FILES FOR: 

Hainmueller, Jens; Marbach, Moritz; Hangartner, Dominik; Harder, Niklas; Vallizadeh, Ehsan. "Refugee Labor Market Integration at Scale: Evidence from Germany's Fast-Track Employment Program." Forthcoming in Proceedings of the National Academy of Sciences (PNAS).

For questions about the replication files, please contact: jhain@stanford.edu and m.marbach@ucl.ac.uk


## Overview 

The underlying administrative data are provided by the Federal Employment Agency (Bundesagentur für Arbeit, BA). There are several constraints affecting the public release of the data. 

1. The counseling contact counts are internal controlling data of the Federal Employment Agency. Because these records are confidential, none of the contact measures can be included in a public-release dataset. 

2. All data are subject to German data protection regulations, which impose strict minimum cell-size requirements to prevent the identification of individuals in small demographic subgroups at the job center level. Because most of our outcome variables are functions of small cell counts for at least some months, we are limited in what we can include in a public-release dataset. 

To release some data compliant with data protection requirements, we have constructed a demonstration dataset for one of our main outcome variables: exit-to-job. This dataset can be used to run our main analysis for a subset of job centers for which the underlying cell-size requirements are met. The results are very similar to those documented in the paper using the full data. The demonstration data, the scripts and the outputs are located in `./minimal`. For further details, see below. 

Researchers wishing to replicate all results with the full data should contact the authors to discuss data access arrangements with the BA. We provide the complete R code used to produce all results in the paper and appendix. These scripts are located in the directory `./full` and cover data processing and analyses. We also share a codebook of the analysis data (`./full/usedata/codebook.csv`). To run the full analysis, call `3_0_main_script.R`. 

## Replication with Demonstration Data 

File: `./minimal/jobturbo_public.Rdata`

To comply with data protection requirements, we retain only job centers where both the numerator (exits to employment) and the denominator (lagged unemployment stock) are at least 5 in every month for all four nationality groups (Germans, other immigrants, other refugees from the top-8 origin countries, and Ukrainian refugees). This filter reduces the sample from 300 BA-operated job centers to 30. Optionskommunen (municipally operated job centers) are excluded entirely, as they are not part of the main analysis. To further protect confidentiality, the dataset contains only the exit-to-job rate, not the underlying counts. This prevents back-calculation of small cell values from other released variables.
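
The retention rule described above can be sketched in base R on toy data. This is a minimal illustration, not the code in `0_make_public_data.R`: the column names `exits` and `stock_lag`, the job center labels, and the counts are all hypothetical.

```r
# Toy data: 3 hypothetical job centers x 2 groups x 2 months.
# `exits` and `stock_lag` are illustrative names, not released variables.
toy <- data.frame(
  jc_id     = rep(c("A", "B", "C"), each = 4),
  nat       = rep(rep(c("DE", "UKR"), each = 2), times = 3),
  exits     = c(12, 9, 6, 7,   10, 8, 4, 6,   5, 5, 5, 5),
  stock_lag = c(300, 310, 80, 85,   250, 240, 60, 62,   90, 95, 4, 40)
)

min_cell <- 5

# A row passes if both the numerator and the denominator meet the threshold.
toy$cell_ok <- toy$exits >= min_cell & toy$stock_lag >= min_cell

# A job center is retained only if every one of its rows passes,
# i.e. all months and all nationality groups.
jc_ok    <- tapply(toy$cell_ok, toy$jc_id, all)
keep_ids <- names(jc_ok)[as.vector(jc_ok)]

public <- toy[toy$jc_id %in% keep_ids, ]
print(keep_ids)
```

In this toy example only job center `A` survives: `B` has one month with fewer than 5 exits and `C` has one month with a lagged stock below 5. A single failing cell is enough to drop the whole job center.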

The demonstration data are a balanced panel of 30 job centers observed over 35 monthly periods (October 2022 to August 2025). Each job center has four rows per month, one for each nationality group, yielding 30 × 35 × 4 = 4,200 observations.
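
As a quick sanity check on these dimensions, the following base R sketch builds a placeholder panel skeleton with the documented shape and verifies that it is balanced. The job center IDs are simulated integers, not the anonymized identifiers in the released file, and the first-of-month encoding of `ym` is an assumption.

```r
# Monthly sequence for the documented observation window
# (first-of-month dates are an assumption about how `ym` is encoded).
months <- seq(as.Date("2022-10-01"), as.Date("2025-08-01"), by = "month")
groups <- c("DE", "sAUSL", "8HKL", "UKR")
n_jc   <- 30

# Simulated skeleton with the documented dimensions (IDs are placeholders).
panel <- expand.grid(
  jc_id = seq_len(n_jc),
  nat   = groups,
  ym    = months,
  stringsAsFactors = FALSE
)

# Balanced: every (jc_id, nat) unit appears exactly once in every month.
unit_counts <- table(paste(panel$jc_id, panel$nat))
is_balanced <- all(unit_counts == length(months)) &&
  !any(duplicated(panel[c("jc_id", "nat", "ym")]))

cat(length(months), "months,", nrow(panel), "rows, balanced:", is_balanced, "\n")
```

The same two checks (equal row counts per unit and no duplicated unit-month pairs) could be applied to the loaded demonstration data using its `jc_id`, `nat`, and `ym` columns.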

- `jc_id`:  Anonymized job center identifier 
- `nat`:  Nationality group: `DE` (Germans), `sAUSL` (other immigrants), `8HKL` (refugees from top-8 origin countries), `UKR` (Ukrainian refugees)
- `ym`:  Year-month (Date)
- `jc_nat_id`:  Panel unit identifier (one per combination of `jc_id` and `nat`), used as the cross-sectional index in `fect()`
- `y_exit_job_full_rate`:  Exit-to-job rate: monthly exits to employment divided by the lagged unemployment stock
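
To make the definition of `y_exit_job_full_rate` concrete, here is a minimal base R sketch for a single hypothetical panel unit. The released data contain only the rate, so the `exits` and `stock` columns and their values are invented for illustration, and the one-month lag is an assumption about what "lagged" means here.

```r
# Toy counts for one hypothetical panel unit; `exits` and `stock` are
# illustrative names, not variables in the released data.
toy <- data.frame(
  ym    = seq(as.Date("2023-01-01"), by = "month", length.out = 4),
  exits = c(10, 12, 15, 9),
  stock = c(400, 380, 360, 350)
)

# Lag the unemployment stock by one month within the unit (assumed lag order).
toy$stock_lag <- c(NA, head(toy$stock, -1))

# Exit-to-job rate: exits in month t divided by the stock in month t - 1.
toy$y_exit_job_full_rate <- toy$exits / toy$stock_lag
print(toy)
```

The first month has no lagged stock, so its rate is `NA`. With several units, the lag would need to be computed separately within each `jc_nat_id`.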

### Scripts

- `0_make_public_data.R`: Constructs the public-release dataset from the internal research data. This script is included for transparency but cannot be run without access to the confidential source data.

- `1_replicate_main_results.R`: Replicates the main results using the public-release data. This script is self-contained and can be run directly on `jobturbo_public.Rdata`. It produces:
    - Descriptive trends (exit-to-job rate, raw and indexed): `fig_desc_trends_exit.pdf`
    - ATT estimation via Interactive Fixed Effects (IFE): `fig_att_ife_{sAUSL,DE}.pdf`, estimates in CSV
    - ATT estimation via Matrix Completion (MC): `fig_att_mc_{sAUSL,DE}.pdf`, estimates in CSV
    - ATT estimation via TWFE Imputation: `fig_att_twfe_{sAUSL,DE}.pdf`, estimates in CSV
    - Placebo tests: `fig_placebo_exit_{comparison}.pdf`, estimates in CSV

All output is saved to the `outputs/` subdirectory.



### R sessionInfo() 

    R version 4.5.2 (2025-10-31)
    Platform: aarch64-apple-darwin25.0.0
    Running under: macOS Tahoe 26.2

    Matrix products: default
    BLAS:   /opt/homebrew/Cellar/openblas/0.3.31_1/lib/libopenblasp-r0.3.31.dylib 
    LAPACK: /opt/homebrew/Cellar/r/4.5.2_1/lib/R/lib/libRlapack.dylib;  LAPACK version 3.12.1

    locale:
    [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

    time zone: GMT
    tzcode source: internal

    attached base packages:
    [1] grid      stats     graphics  grDevices utils     datasets  methods  
    [8] base     

    other attached packages:
    [1] ggtext_0.1.2      fect_2.0.5        patchwork_1.3.1   viridis_0.6.5    
    [5] viridisLite_0.4.2 scales_1.4.0      lubridate_1.9.4   forcats_1.0.0    
    [9] stringr_1.6.0     dplyr_1.2.0       purrr_1.2.1       readr_2.1.5      
    [13] tidyr_1.3.2       tibble_3.3.1      ggplot2_4.0.0     tidyverse_2.0.0  

    loaded via a namespace (and not attached):
    [1] gtable_0.3.6        xfun_0.56           GGally_2.4.0       
    [4] lattice_0.22-7      tzdb_0.5.0          numDeriv_2016.8-1.1
    [7] vctrs_0.7.1         tools_4.5.2         generics_0.1.4     
    [10] parallel_4.5.2      sandwich_3.1-1      pacman_0.5.1       
    [13] pkgconfig_2.0.3     RColorBrewer_1.1-3  S7_0.2.1           
    [16] rngtools_1.5.2      stringmagic_1.2.0   lifecycle_1.0.5    
    [19] compiler_4.5.2      farver_2.1.2        textshaping_1.0.1  
    [22] codetools_0.2-20    litedown_0.7        Formula_1.2-5      
    [25] pillar_1.11.1       MASS_7.3-65         doRNG_1.8.6.2      
    [28] iterators_1.0.14    abind_1.4-8         foreach_1.5.2      
    [31] nlme_3.1-168        parallelly_1.45.1   commonmark_1.9.5   
    [34] ggstats_0.11.0      tidyselect_1.2.1    digest_0.6.39      
    [37] mvtnorm_1.3-3       stringi_1.8.7       future_1.67.0      
    [40] reshape2_1.4.4      listenv_0.9.1       labeling_0.4.3     
    [43] cli_3.6.5           magrittr_2.0.4      future.apply_1.20.0
    [46] withr_3.0.2         dreamerr_1.5.0      timechange_0.3.0   
    [49] globals_0.18.0      gridExtra_2.3       ragg_1.4.0         
    [52] zoo_1.8-14          hms_1.1.3           fixest_0.13.2      
    [55] doParallel_1.0.17   markdown_2.0        rlang_1.1.7        
    [58] gridtext_0.1.5      Rcpp_1.1.1          glue_1.8.0         
    [61] xml2_1.4.1          R6_2.6.1            plyr_1.8.9         
    [64] systemfonts_1.3.1  